156 research outputs found

    Conformational Proofreading: The Impact of Conformational Changes on the Specificity of Molecular Recognition

    To perform recognition, molecules must locate and specifically bind their targets within a noisy biochemical environment with many look-alikes. Molecular recognition processes, especially the induced-fit mechanism, are known to involve conformational changes. This raises a basic question: Does molecular recognition gain any advantage by such conformational changes? By introducing a simple statistical-mechanics approach, we study the effect of conformation and flexibility on the quality of recognition processes. Our model relates specificity to the conformation of the participant molecules and thus suggests a possible answer: Optimal specificity is achieved when the ligand is slightly off target; that is, a conformational mismatch between the ligand and its main target improves the selectivity of the process. This indicates that deformations upon binding serve as a conformational proofreading mechanism, which may be selected for via evolution.

    From Nonspecific DNA–Protein Encounter Complexes to the Prediction of DNA–Protein Interactions

    © 2009 Gao, Skolnick. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. doi:10.1371/journal.pcbi.1000341
    DNA–protein interactions are involved in many essential biological activities. Because there is no simple mapping code between DNA base pairs and protein amino acids, the prediction of DNA–protein interactions is a challenging problem. Here, we present a novel computational approach for predicting DNA-binding protein residues and DNA–protein interaction modes without knowledge of the protein's specific DNA target sequence. Given the structure of a DNA-binding protein, the method first generates an ensemble of complex structures obtained by rigid-body docking with a nonspecific canonical B-DNA. Representative models are subsequently selected through clustering and ranking by their DNA–protein interfacial energy. Analysis of these encounter complex models suggests that the recognition sites for specific DNA binding are usually favorable interaction sites for the nonspecific DNA probe and that nonspecific DNA–protein interaction modes exhibit some similarity to specific DNA–protein binding modes. Although the method requires as input the knowledge that the protein binds DNA, in benchmark tests it achieves better performance in identifying DNA-binding sites than three previously established methods, which are based on sophisticated machine-learning techniques. We further apply our method to protein structures predicted through modeling and demonstrate that it performs satisfactorily on protein models whose Cα root-mean-square deviation from the native structure is up to 5 Å. This study provides valuable structural insights into how a specific DNA-binding protein interacts with a nonspecific DNA sequence. The similarity between the specific DNA–protein interaction mode and nonspecific interaction modes may reflect an important sampling step in a DNA-binding protein's search for its specific DNA targets.

    Prediction of DNA-binding propensity of proteins by the ball-histogram method using automatic template search

    We contribute a novel ball-histogram approach to predicting the DNA-binding propensity of proteins. Unlike state-of-the-art methods based on constructing an ad hoc set of features describing physicochemical properties of the proteins, the ball-histogram technique enables a systematic Monte Carlo exploration of the spatial distribution of amino acids complying with automatically selected properties. This exploration yields a model for the prediction of DNA-binding propensity. We validate our method in prediction experiments, improving on state-of-the-art accuracies. Moreover, our method provides interpretable features involving spatial distributions of selected amino acids.

    Local Gene Regulation Details a Recognition Code within the LacI Transcriptional Factor Family

    The specific binding of regulatory proteins to DNA sequences exhibits no clear patterns of association between amino acids (AAs) and nucleotides (NTs). This complexity of protein-DNA interactions raises the question of whether a simple set of wide-coverage recognition rules can ever be identified. Here, we analyzed this issue using the extensive LacI family of transcriptional factors (TFs). We searched for recognition patterns by introducing a new approach to phylogenetic footprinting, based on the pervasive presence of local regulation in prokaryotic transcriptional networks. We identified a set of specificity correlations (determined by two AAs of the TFs and two NTs in the binding sites) that is conserved throughout a dominant subgroup within the family regardless of the evolutionary distance, and that acts as a relatively consistent recognition code. The proposed rules are confirmed by data from previous experimental studies and by events of convergent evolution in the phylogenetic tree. The presence of a code emphasizes the stable structural context of the LacI family, while defining a precise blueprint to reprogram TF specificity with many practical applications.
    Funding: Ministerio de Ciencia e Innovación, Spain (Formación de Profesorado Universitario fellowship); Ministerio de Ciencia e Innovación, Spain (grant BFU2008-03632/BMC); Madrid (Spain : Region) (grant CCG08-CSIC/SAL-3651)

    Probing the Informational and Regulatory Plasticity of a Transcription Factor DNA–Binding Domain

    Transcription factors have two functional constraints on their evolution: (1) their binding sites must have enough information to be distinguishable from all other sequences in the genome, and (2) they must bind these sites with an affinity that appropriately modulates the rate of transcription. Since both are determined by the biophysical properties of the DNA-binding domain, selection on one will ultimately affect the other. We were interested in understanding how plastic the informational and regulatory properties of a transcription factor are and how transcription factors evolve to balance these constraints. To study this, we developed an in vivo selection system in Escherichia coli to identify variants of the helix-turn-helix transcription factor MarA that bind different sets of binding sites with varying degrees of degeneracy. Unlike previous in vitro methods used to identify novel DNA binders and to probe the plasticity of the binding domain, our selections were done within the context of the initiation complex, selecting for both specific binding within the genome and for a physiologically significant strength of interaction to maintain function of the factor. Using MITOMI, quantitative PCR, and a binding site fitness assay, we characterized the binding, function, and fitness of some of these variants. We observed that a large range of binding preferences, information contents, and activities could be accessed with a few mutations, suggesting that transcriptional regulatory networks are highly adaptable and expandable.

    Reliable transfer of transcriptional gene regulatory networks between taxonomically related organisms

    Baumbach J, Rahmann S, Tauch A. Reliable transfer of transcriptional gene regulatory networks between taxonomically related organisms. BMC Systems Biology. 2009;3(1):8.
    Background: Transcriptional regulation of gene activity is essential for any living organism. Transcription factors therefore recognize specific binding sites within the DNA to regulate the expression of particular target genes. The genome-scale reconstruction of the emerging regulatory networks is important for biotechnology and human medicine but cost-intensive, time-consuming, and impossible to perform for every species separately. By using bioinformatics methods one can partially transfer networks from well-studied model organisms to closely related species. However, the prediction quality is limited by the low level of evolutionary conservation of the transcription factor binding sites, even within organisms of the same genus.
    Results: Here we present an integrated bioinformatics workflow that assures the reliability of transferred gene regulatory networks. Our approach combines three methods that can be applied on a large scale: re-assessment of annotated binding sites, subsequent binding site prediction, and homology detection. A gene regulatory interaction is considered to be conserved if (1) the transcription factor, (2) the adjusted binding site, and (3) the target gene are conserved. The power of the approach is demonstrated by transferring gene regulations from the model organism Corynebacterium glutamicum to the human pathogens C. diphtheriae and C. jeikeium, and to the biotechnologically relevant C. efficiens. For these three organisms we identified reliable transcriptional regulations for approximately 40% of the common transcription factors, compared to approximately 5% for which knowledge was available before.
    Conclusion: Our results suggest that trustworthy genome-scale transfer of gene regulatory networks between organisms is feasible in general but still limited by the level of evolutionary conservation.
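
The three-part conservation criterion stated in the Results can be illustrated with a small filter. This is a minimal sketch and not the authors' workflow: the function name `transfer_interactions`, the homolog dictionary, the set of predicted binding sites, and all gene names are hypothetical placeholders chosen for illustration.

```python
def transfer_interactions(interactions, homologs, predicted_sites):
    """Keep a regulatory interaction only if all three criteria hold.

    interactions:    iterable of (tf, target_gene) pairs in the source organism
    homologs:        dict mapping a source gene to its homolog in the target
                     organism (assumed to come from a homology search)
    predicted_sites: set of (tf_homolog, gene_homolog) pairs for which a
                     binding site was predicted in the target organism
    """
    transferred = []
    for tf, gene in interactions:
        tf_h = homologs.get(tf)       # (1) is the transcription factor conserved?
        gene_h = homologs.get(gene)   # (3) is the target gene conserved?
        if tf_h is None or gene_h is None:
            continue
        if (tf_h, gene_h) in predicted_sites:  # (2) is the binding site conserved?
            transferred.append((tf_h, gene_h))
    return transferred


# Hypothetical toy data: only the first interaction meets all three criteria.
source_network = [("tfA", "gene1"), ("tfA", "gene2"), ("tfB", "gene3")]
homolog_map = {"tfA": "tfA_t", "gene1": "gene1_t", "gene2": "gene2_t", "gene3": "gene3_t"}
site_predictions = {("tfA_t", "gene1_t")}
conserved = transfer_interactions(source_network, homolog_map, site_predictions)
```

Requiring all three criteria at once makes the filter conservative: an interaction is dropped as soon as any one of them fails.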

    From DNA sequence to application: possibilities and complications

    The development of sophisticated genetic tools during the past 15 years has facilitated a tremendous increase in fundamental and application-oriented knowledge of lactic acid bacteria (LAB) and their bacteriophages. This knowledge relates both to the assignment of open reading frames (ORFs) and to the function of non-coding DNA sequences. Comparison of the complete nucleotide sequences of several LAB bacteriophages has revealed that their chromosomes have a fixed, modular structure, each module having a set of genes involved in a specific phase of the bacteriophage life cycle. LAB bacteriophage genes and DNA sequences have been used for the construction of temperature-inducible gene expression systems, gene-integration systems, and bacteriophage defence systems. The functions of several LAB open reading frames and transcriptional units have been identified and characterized in detail. Many of these could find practical applications, such as induced lysis of LAB to enhance cheese ripening and re-routing of carbon fluxes for the production of a specific amino acid enantiomer. More knowledge has also become available concerning the function and structure of non-coding DNA positioned at or in the vicinity of promoters. In several cases the mRNA produced from this DNA contains a transcriptional terminator-antiterminator pair, in which the antiterminator can be stabilized either by uncharged tRNA or by interaction with a regulatory protein, thus preventing formation of the terminator so that mRNA elongation can proceed. Evidence has accumulated showing that in LAB, too, carbon catabolite repression is mediated by specific DNA elements in the vicinity of promoters governing the transcription of catabolic operons. Although some biological barriers have yet to be overcome, the vast body of scientific information presently available allows the construction of tailor-made genetically modified LAB. Today, it appears that societal constraints rather than biological hurdles impede the use of genetically modified LAB.

    PDNAsite: identification of DNA-binding site from protein sequence by incorporating spatial and sequence context

    Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most existing computational approaches employ only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of the support vector machine (SVM) classifier and ensemble learning. Results on the PDNA-62 and PDNA-224 datasets demonstrate that features extracted from the spatial context provide more information than those from the sequence context, and that combining the two yields a further performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicates that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web server for our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely accessible to the biological research community.

    A Linear Model for Transcription Factor Binding Affinity Prediction in Protein Binding Microarrays

    Protein binding microarrays (PBMs) are a high-throughput technology used to characterize protein-DNA binding. The arrays measure a protein's affinity toward thousands of double-stranded DNA sequences at once, producing a comprehensive binding specificity catalog. We present a linear model for predicting the binding affinity of a protein toward DNA sequences based on PBM data. Our model represents the measured intensity of an individual probe as a sum of the binding affinity contributions of the probe's subsequences. These subsequences characterize a DNA binding motif and can be used to predict the intensity of protein binding against arbitrary DNA sequences. Our method was the best performer in the Dialogue for Reverse Engineering Assessments and Methods 5 (DREAM5) transcription factor/DNA motif recognition challenge. For the DREAM5 bonus challenge, we also developed an approach for the identification of transcription factors based on their PBM binding profiles. Our approach for TF identification achieved the best performance in the bonus challenge.
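
The additive idea in this abstract, probe intensity as a sum of subsequence contributions, can be sketched in a few lines. This is an illustration under stated assumptions and not the authors' implementation: fixed-length k-mers as the subsequences, plain gradient descent as the fitting procedure, and the names `kmer_counts`, `predict_intensity`, and `fit_weights` are all hypothetical choices, since the abstract does not specify them.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every length-k subsequence (k-mer) of a probe sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def predict_intensity(seq, weights, k):
    """Additive model: intensity is the sum of the probe's k-mer contributions."""
    return sum(n * weights.get(kmer, 0.0) for kmer, n in kmer_counts(seq, k).items())


def fit_weights(probes, intensities, k, epochs=200, lr=0.05):
    """Estimate k-mer contributions by per-probe gradient steps on squared error.

    The optimizer is an assumption made for this sketch; any linear
    least-squares solver could be substituted.
    """
    weights = {}
    for _ in range(epochs):
        for seq, y in zip(probes, intensities):
            counts = kmer_counts(seq, k)
            err = predict_intensity(seq, weights, k) - y
            for kmer, n in counts.items():
                weights[kmer] = weights.get(kmer, 0.0) - lr * err * n
    return weights


# Hypothetical toy data generated from two "true" 3-mer contributions.
true_weights = {"ACG": 2.0, "CGT": 1.0}
probes = ["ACGTAC", "TTACGT", "GGGTTT", "ACGACG"]
intensities = [predict_intensity(p, true_weights, 3) for p in probes]
fitted = fit_weights(probes, intensities, 3)
```

Once fitted, `predict_intensity` can be applied to sequences that were not on the array, which mirrors the abstract's point that the learned contributions predict binding against arbitrary DNA sequences.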

    Bacteriophage Crosstalk: Coordination of Prophage Induction by Trans-Acting Antirepressors

    Many species of bacteria harbor multiple prophages in their genomes. Prophages often carry genes that confer a selective advantage to the bacterium, typically during host colonization. Prophages can convert to infectious viruses through a process known as induction, which is relevant to the spread of bacterial virulence genes. The paradigm of prophage induction, as set by the phage Lambda model, sees the process initiated by the RecA-stimulated self-proteolysis of the phage repressor. Here we show that a large family of lambdoid prophages found in Salmonella genomes employs an alternative induction strategy. The repressors of these phages are not cleaved upon induction; rather, they are inactivated by the binding of small antirepressor proteins. Formation of the complex causes the repressor to dissociate from DNA. The antirepressor genes lie outside the immunity region and are under direct control of the LexA repressor, thus plugging prophage induction directly into the SOS response. GfoA and GfhA, the antirepressors of Salmonella prophages Gifsy-1 and Gifsy-3, each target both of these phages' repressors, GfoR and GfhR, even though the latter proteins recognize different operator sites and the two phages are heteroimmune. In contrast, the Gifsy-2 phage repressor, GtgR, is insensitive to GfoA and GfhA, but is inactivated by an antirepressor from the unrelated Fels-1 prophage (FsoA). This response is all the more surprising as FsoA is under the control of the Fels-1 repressor, not LexA, and plays no apparent role in Fels-1 induction, which occurs via a Lambda CI-like repressor cleavage mechanism. The ability of antirepressors to recognize non-cognate repressors allows coordination of induction of multiple prophages in polylysogenic strains. 
Identification of non-cleavable gfoR/gtgR homologues in a large variety of bacterial genomes (including most Escherichia coli genomes in the DNA database) suggests that antirepression-mediated induction is far more common than previously recognized.